Add support for heterogeneous job group indices in SlurmExecutor #158

Merged
hemildesai merged 2 commits into main from hemil/het-group-slurm
Feb 19, 2025

@hemildesai

  • Introduce het_group_index parameter in ResourceRequest
  • Add het_group_indices parameter to SlurmExecutor
  • Implement validation and handling of heterogeneous job group indices
  • Modify Slurm batch request generation to use custom or default group indices

Example:

with run.Experiment("debug") as exp:
    executor.het_group_indices = [0, 1, 1]
    exp.add(
        [inline_script, inline_script, inline_script],
        tail_logs=True,
        executor=[executor, executor_2, executor_2],
    )
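
For readers who want a feel for the validation and default handling listed in the bullets above, here is a minimal sketch. It is not the code from this PR: the helper name `resolve_het_group_indices`, the default of one group per task, and the non-decreasing rule are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def resolve_het_group_indices(
    num_tasks: int, het_group_indices: Optional[Sequence[int]] = None
) -> List[int]:
    """Hypothetical helper: return one heterogeneous group index per task."""
    if het_group_indices is None:
        # Assumed default: every task gets its own group, numbered in order.
        return list(range(num_tasks))

    indices = list(het_group_indices)
    if len(indices) != num_tasks:
        raise ValueError(
            f"expected {num_tasks} group indices, got {len(indices)}"
        )
    if any(i < 0 for i in indices):
        raise ValueError("group indices must be non-negative")
    # Assumed rule: tasks sharing a group must be adjacent, so the
    # sequence may repeat a value or increase, but never decrease.
    if any(b < a for a, b in zip(indices, indices[1:])):
        raise ValueError("group indices must be non-decreasing")
    return indices


print(resolve_het_group_indices(3, [0, 1, 1]))  # custom indices from the example
print(resolve_het_group_indices(3))             # assumed default indices
```

With the indices from the example, the second and third tasks would share group 1 while the first task stays in group 0.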

Signed-off-by: Hemil Desai <hemild@nvidia.com>
@hemildesai requested a review from Kipok on February 19, 2025, 20:57

@Kipok left a comment

Thanks!

@hemildesai merged commit 5585ec6 into main on Feb 19, 2025
7 checks passed